to distinguish the importance of data collected and apply the suitable security approach for each type of
data collected. This was done by using a hybrid system that combines block cipher and stream cipher
systems. After data classification using machine learning classifiers, the less important data are encrypted
using a stream cipher (SC) that uses the Rivest cipher 4 (RC4) algorithm, and the more important data are
encrypted using a block cipher (BC) that uses the advanced encryption standard (AES) algorithm. A
simulation-based performance evaluation shows that the proposed hybrid system encrypts the data with less
central processing unit (CPU) time while improving the security of the data.
1. INTRODUCTION

The internet of things (IoT) is defined as connecting all objects in different environments through
the internet. These objects collect different data, and sometimes the data may be of high importance, whether
it concerns the surrounding environment or the users themselves [1]. Therefore, it is necessary to ensure that
only the receiver can safely recover this information [2] and to protect it from risks such as penetration by
unauthorized persons or eavesdropping by a third party [3]. Figure 1 shows the definition of IoT. The IoT
owes its widespread adoption to enabling easy access to, and collaboration with, a large number of devices,
for example, personal appliances, surveillance cameras, sensors, motors, and screens. The IoT will promote
the development of applications that make use of the massive amount of information provided by these
objects to provide new services to citizens, companies, and public administrations [4].

Today, the internet has many uses, such as making data globally available to authorized users and to
online data processing units. The data can be sensitive, and exposing it violates the privacy of users. This
risk is exacerbated by the trend to separate the sensor network infrastructure from the applications.
Therefore, a security solution must be provided to achieve an appropriate level of security for the IoT [5].
To lower cost and time to market, IoT manufacturers did not make security a priority in their IoT devices.
A few manufactured devices include software-based security, such as firmware protections; however, these
solutions do not take into consideration how the usage patterns of IoT devices differ from those of personal
computers, so they prove to be irrelevant at times [6]. Moreover, focusing on software-based protection
systems often leaves the device unintentionally weak, enabling new attack vectors [7].
Figure 1. Definition of IoT [5]


Network security and data encryption are currently very important topics in modern communication
network research. When confidential information is sent from one party to another, the data should not be
intercepted by an unauthorized person. Cryptography is an active research field in which scientists try to
develop encryption algorithms strong enough that no attacker can read an intercepted message. This means
that whenever we want to send messages to someone, they must be encrypted so that no one can decrypt them
without knowing the key to the decryption process [8], as shown in Figure 2.


Figure 2. Encryption and decryption process [9] 


For this type of security, two methods of encryption/decryption are used: symmetric and asymmetric.
In symmetric encryption, one key is used for both encryption and decryption of the data [10]. Symmetric
algorithms are classified as either stream ciphers or block ciphers, and they also differ in key size. A
stream cipher has two main components: the mixing function and the key stream generator. The first
component is the exclusive OR (XOR) function, while the second is the generator, which is considered the
main unit of stream cipher (SC) encryption. Block cipher (BC) algorithms transform N-bit blocks of
plaintext, under the selected secret key, into N-bit blocks of encrypted data [11].

In the last few years, the once widely used stream cipher has been replaced by the block cipher. This
is due to a number of reasons, including security, which is one of the weaknesses of the stream cipher and is
much lower than the security provided by the block cipher. The other reason is that efficiency has been
reduced in many applications where the stream cipher is used, a problem that had to be addressed [12]. In
this work, we compare the SC, BC, and hybrid system methods shown in Figure 3. Rivest cipher 4 (RC4) is
used for the stream cipher, and AES for the block cipher. We designed a hybrid system that takes advantage
of both block and stream ciphers [13].

The stream cipher [14] starts with an initial step, called the warm-up phase, which processes the key and an
initialization vector (IV) to produce the internal state from which the first output bits or bytes are
generated. The time required to perform the "key setting" and "IV setting" steps is then tested. Moreover,
one of the main advantages of a stream cipher is that it can produce long sequences at the high speed
required for the encryption process. A stream cipher is usually used when wireless communication is
required because it can reach significant throughput at limited cost and, like a one-time pad, it does not
propagate errors caused by the channel.

Block cipher [15] is a type of symmetric encryption that works on blocks of data. Block ciphers include
data encryption standard (DES), advanced encryption standard (AES), RC6, and international data
encryption algorithm (IDEA); modern block ciphers such as AES typically use a block length of 128 bits and
support key sizes of 128, 192, and 256 bits [16]. Some modes of operation, such as the counter with cipher
block chaining-message authentication code (CCM) mode [17], are defined only for block ciphers with a
128-bit block size and cannot be used with a 64-bit block size. When the length of the plaintext is not a
multiple of the block size, the block cipher must be used with ciphertext stealing or residual block
termination to avoid padding [18]. When the block of data that the BC has to encrypt/decrypt is shorter than
the block size, the BC cannot work on it directly.




Figure 3. Block and stream algorithms considered for this work [19] 


With the rapid increase in the volume of information, text classification has become an important
tool for dealing with this huge volume of data. Text classification techniques are used to categorize news
stories, to find interesting information on the world wide web (WWW), and to guide user search through
hypertext [20], [21]. The most common classifiers are support vector machine (SVM), naive Bayes (NB), and
K-nearest neighbor (KNN) [22]; in this work they divide data into two parts: important and more important.
In [23], [24], the authors compare block cipher algorithms (IDEA, Blowfish, RC2, Serpent, Cast5, and RC6)
with stream cipher algorithms (Salsa 20, HC-128, VMPC, RC4, HC-256, and Grain) in terms of CPU time and
throughput, as shown in Table 1. They concluded that the SC is faster than the BC.


Table 1. Comparison between block cipher algorithms and stream cipher algorithms [24] 


Feature | AES | 3DES | DES | Cast-5/Blowfish | VMPC/Salsa 20
Key length | 128, 192, or 256 bits | 168 bits (k1, k2, k3); 112 bits (k1 and k2 are the same) | 56 bits | 128, 192, and 256 bits | 128, 192, and 256 bits
Cipher type | Symmetric block cipher | Symmetric block cipher | Symmetric block cipher | Symmetric block cipher | Symmetric stream cipher
Block size | 128, 192, or 256 bits | 64 bits | 64 bits | 64 bits | 128 bits
Security | Considered secure | Only one weakness, which exists in DES | Proven inadequate | Provides stronger security | Provides stronger security
Number of rounds | 16 rounds | 48 rounds | 16 rounds | Fixed | 8, 12, or 20 rounds


SVM is a machine learning algorithm whose goal is to find a hyperplane that classifies data points in
an N-dimensional space. The dimension of the hyperplane depends on the number of features. The data
points closest to the hyperplane are called support vectors because they are used to build the support vector
machine model. The output of SVM is a hyperplane that separates the classes and is used to classify new data
points. The parameters of SVM are kernel, regularization, gamma, and margin. The kernel transforms the
problem using linear algebra so that the hyperplane can be learned. With the linear kernel, a new instance
is predicted using (1), which computes the inner products of the new instance vector with each support
vector in the training data.


$f(x) = B_0 + \sum_{i} a_i \langle x, x_i \rangle$ (1)


A dynamic data encryption method based on addressing the data importance on ... (Dana Khwailleh) 


2142 O ISSN: 2088-8708 
In the polynomial kernel, the prediction is done using (2).

$K(x, x_i) = \left(1 + \langle x, x_i \rangle\right)^d$ (2)

In the exponential kernel, the prediction is done using (3).

$K(x, x_i) = \exp\left(-\gamma \lVert x - x_i \rVert^2\right)$ (3)
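The three kernels can be illustrated with a minimal Python sketch. It assumes that (2) is read as the usual polynomial kernel, (1 + <x, xi>)^d, and that the sum in (3) runs over squared feature differences. The function names and default parameter values are hypothetical and are not taken from the original experiments.

```python
import math


def linear_kernel(x, xi):
    """Inner product <x, xi>, the term used in (1)."""
    return sum(a * b for a, b in zip(x, xi))


def polynomial_kernel(x, xi, d=2):
    """Polynomial kernel, read here as (1 + <x, xi>)^d."""
    return (1.0 + linear_kernel(x, xi)) ** d


def exponential_kernel(x, xi, gamma=0.5):
    """Exponential kernel exp(-gamma * sum((x - xi)^2)), as in (3)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))


def svm_decision(x, support_vectors, coefficients, bias, kernel=linear_kernel):
    """Decision value: bias plus the weighted kernel terms over all support vectors."""
    return bias + sum(a * kernel(x, sv) for a, sv in zip(coefficients, support_vectors))
```

With the linear kernel, `svm_decision` reduces to (1); passing `polynomial_kernel` or `exponential_kernel` as the `kernel` argument swaps in the other two forms.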


The regularization parameter specifies how strongly classification errors are penalized. Large
values give higher accuracy on the training data, and smaller values give lower accuracy. The gamma
parameter specifies how far the influence of a single data point reaches: with a low gamma value, data points
far from the separation line are taken into account, and with a high gamma value, only data points close to
the separation line are taken into account. The margin is the distance between the separation line and the
closest data points; a larger margin is better because it helps avoid points crossing into the other class [25].

Naive Bayes is a supervised machine learning algorithm that calculates the probability of each class given
the features and chooses the class with the highest probability. It is based on Bayes' rule, in which
P(A|B) is the probability of A given that B has happened. This classifier gives good performance in
recommender systems and text classification. Naive Bayes assumes that the features are independent.


$P(A \mid B) = \dfrac{P(B \mid A)\, P(A)}{P(B)}$
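As a worked illustration of the rule, the short Python sketch below computes P(A|B) and obtains P(B) from the law of total probability. The probabilities are hypothetical values chosen only for the example, and the function name is an assumption.

```python
def bayes_posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """Return P(A|B) = P(B|A) P(A) / P(B), with P(B) from total probability."""
    evidence_b = (likelihood_b_given_a * prior_a
                  + likelihood_b_given_not_a * (1.0 - prior_a))
    return likelihood_b_given_a * prior_a / evidence_b


# Hypothetical inputs: P(A) = 0.3, P(B|A) = 0.8, P(B|not A) = 0.1
posterior = bayes_posterior(0.3, 0.8, 0.1)
```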


Multinomial naive Bayes is a version of the naive Bayes classifier that likewise assumes independence
between features (attributes), and it gives efficient performance [25].

KNN is a classification technique that depends on majority voting among neighbors: a new input is assigned
the most common class label of its neighbors, where K is the number of neighbors considered in the voting.
After applying classification to the test set, we calculate the performance of our classifier using the
evaluation metrics precision, recall, F-measure, and accuracy [25].

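- Precision = TP/(TP + FP)
- Recall = TP/(TP + FN)
- Accuracy = (TP + TN)/(TP + TN + FP + FN)
- F-measure = 2 * (Precision * Recall)/(Precision + Recall)

Here TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false
negatives, respectively.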
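The four metrics can be computed directly from the confusion-matrix counts. The following Python sketch is only an illustration; the function name and the zero-division handling are assumptions and are not part of the original evaluation, which was carried out in WEKA.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute precision, recall, accuracy, and F-measure from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    if precision + recall:
        f_measure = 2 * (precision * recall) / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }
```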

We find that KNN gives the best results when we increase the training set. The rest of this paper is
organized as follows. The second section presents the proposed method in detail. The third section presents
and discusses the simulation environment and the results. The last section concludes the article.


2. THE PROPOSED METHOD 

As mentioned, data collected from IoT devices should be classified according to its importance in
order to secure it quickly and accurately. The proposed method starts by classifying the data sets collected
from IoT devices using the most appropriate machine learning classifier; each class of data is then passed to
the security method that matches its importance.
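The classify-then-encrypt flow can be sketched in Python as a simple dispatcher. This is a minimal sketch and not the authors' implementation: the function name, the `is_more_important` callback (standing in for the trained classifier), and the two cipher callbacks are all hypothetical placeholders.

```python
from typing import Callable, Iterable, List, Tuple


def encrypt_by_importance(
    records: Iterable[bytes],
    is_more_important: Callable[[bytes], bool],
    block_encrypt: Callable[[bytes], bytes],
    stream_encrypt: Callable[[bytes], bytes],
) -> List[Tuple[str, bytes]]:
    """Route each record to the cipher that matches its importance class."""
    encrypted = []
    for record in records:
        if is_more_important(record):
            # More important data goes to the block cipher (BC).
            encrypted.append(("BC", block_encrypt(record)))
        else:
            # Less important data goes to the stream cipher (SC).
            encrypted.append(("SC", stream_encrypt(record)))
    return encrypted
```

Each output item carries a tag naming the cipher that was applied, so that a decryption routine could select the matching algorithm for each record.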


2.1. Data set 

The data used in this article can be accessed from [26]. This data set has two classes: Normal
patients (100 patients) and Abnormal patients (210 patients). Each row represents a patient described by
six attributes: pelvic incidence, pelvic tilt, lumbar lordosis angle, sacral slope, pelvic radius, and grade of
spondylolisthesis.


2.2. Data classification 

Selecting the best classification method to categorize the data is a very important step, and the
selection should be based on our own experiments or on previous studies. We therefore tested three machine
learning classifiers: KNN, SVM, and NB [27]. We used the Waikato environment for knowledge analysis
(WEKA), which is free software that contains tools and algorithms for data analysis, to train the classifiers
and measure the performance of each one. For each classifier, we ran 10-fold cross-validation on the data with
training set sizes of 60, 70, and 80, and we also used a split of 70% training data and 30% test data. We
found that the NB results were almost constant even when the training set increased. The SVM results were
high at first but dropped significantly when the training set increased. KNN was better than both NB and
SVM, and its accuracy kept increasing as the training set increased, as shown in Figure 4. So, we used KNN
to categorize the data: more important data is encrypted using the block cipher, and less important data is
encrypted using the stream cipher.
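To make the voting step of KNN concrete, the following Python sketch classifies a query point by majority vote among its K nearest training points. It is only an illustration under simple assumptions (Euclidean distance, small in-memory data); the experiments in this paper used WEKA, and the function name and sample points are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the most common label among the k training points nearest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k nearest neighbors and pick the majority.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

For example, with two "Normal" points near the origin and two "Abnormal" points near (5, 5), a query close to the origin with K = 3 receives two "Normal" votes and one "Abnormal" vote.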
Figure 4. Comparison between NB, SVM, and KNN classifiers


2.3. Proposed hybrid system architecture 

The hybrid system consists of two ciphers: SC and BC. First, we added some improvements to the
BC: we divided the data into blocks of variable size to get rid of the padding, and we used a different key to
encrypt each block to increase the protection of the data. We used the AES algorithm in cipher block chaining
(CBC) mode. As far as the SC is concerned, we reduced the size of the data by almost half so that less time
is needed to encrypt it, and we used the RC4 algorithm.

Our system passes the data (plaintext) to the KNN classifier, which classifies it into two parts:
important and more important. The more important data is then encrypted using the block cipher, which
uses AES. AES has a fixed block size of 128 bits and a key size of 128, 192, or 256 bits. AES operates on a
4x4 column-major order array of bytes, called the state. Most AES calculations are done in a particular finite
field. Each round consists of several processing steps: SubBytes, where each byte is replaced with another
according to a lookup table; ShiftRows, where the last three rows of the state are shifted cyclically a certain
number of steps; MixColumns, which operates on the columns of the state, combining the four bytes in each
column; and AddRoundKey, where each byte of the state is combined with a block of the round key using
XOR. The round keys are derived from the encryption key by KeyExpansion [28], [29].

The less important data is encrypted using the stream cipher, which uses RC4. RC4 creates a key stream
that, as with any stream cipher, is combined with the plaintext using XOR. To create the key stream, the
cipher uses a secret internal state that consists of two parts: a permutation of all 256 possible bytes and two
8-bit index pointers. The permutation is initialized with a variable-length key, usually between 40 and 2048
bits, using the key scheduling algorithm (KSA), and the key stream is then generated using the
pseudo-random generation algorithm (PRGA) [30]. The result is the encrypted data (ciphertext), as shown in
Figure 5. The same technique is used for decryption.
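The two RC4 stages described above, KSA and PRGA, can be written in a few lines of Python. This is an illustrative sketch of the textbook algorithm, not the code used in the experiments, and the function names are assumptions.

```python
def rc4_keystream(key: bytes):
    """Yield RC4 key stream bytes for the given key."""
    # KSA: initialize the 256-byte permutation from the variable-length key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    # PRGA: advance the two 8-bit index pointers and emit one byte per step.
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        yield state[(state[i] + state[j]) % 256]


def rc4(key: bytes, data: bytes) -> bytes:
    """XOR the data with the key stream; the same call encrypts and decrypts."""
    return bytes(b ^ k for b, k in zip(data, rc4_keystream(key)))
```

Because encryption is a plain XOR with the key stream, calling `rc4` on a ciphertext with the same key returns the original plaintext.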
Figure 5. Hybrid system encryption and decryption


3. SIMULATION AND RESULTS
3.1. Simulation parameter 

Table 2 presents the features of the stream cipher (RC4) and the block cipher (AES), which are used
and compared with the proposed method. The comparison focuses on the CPU processing time and on the
impact of the proposed method in increasing security for important data while reducing the central
processing unit (CPU) time.


Table 2. Comparing AES and RC4 [30] 


Algorithm | AES | RC4
Block size | 64 bits and more | 8 bits
Key size | 128/192/256 bits | 1-256 bits
Key schedule | Complex | Simple
Complexity | Simple design | Comparatively complex


3.2. Experiment setting and challenges 

In this study, we used a desktop computer with a 2.00 GHz processor and 16.00 GB of RAM running
Windows, together with a Java integrated development tool that also supports various programming languages
such as Python, Scala, and Java. As mentioned earlier, the main goal of this work is to compare the
performance of the stream, block, and hybrid algorithms. To do so, we carried out the following tasks: i) use
the KNN algorithm to classify the data set used in this work; ii) calculate the encryption/decryption time of
each algorithm using input files of different sizes; and iii) calculate the CPU processing time of
encryption/decryption for each algorithm using input files of different sizes.
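A measurement of this kind can be sketched with Python's standard `time.process_time`, which reports the CPU time of the current process. The harness below is a hypothetical illustration and not the tool used in this study; the function names, the repeat count, and the use of random input bytes are assumptions.

```python
import os
import time


def cpu_time(func, data, repeats=5):
    """Return the smallest CPU time (in seconds) over several runs of func(data)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.process_time()
        func(data)
        best = min(best, time.process_time() - start)
    return best


def benchmark(ciphers, sizes):
    """Measure each cipher callable on random inputs of the given sizes (in bytes)."""
    results = {}
    for name, func in ciphers.items():
        results[name] = [(size, cpu_time(func, os.urandom(size))) for size in sizes]
    return results
```

Passing a dictionary such as `{"SC": stream_encrypt, "BC": block_encrypt, "hybrid": hybrid_encrypt}`, where the three callables are placeholders for the algorithms under test, yields one list of (size, CPU time) pairs per algorithm that can be plotted directly.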

One of the challenges we encountered in this study is that collecting the data takes a long time, and
finding large and suitable data sets is not easy. Also, the data has to be visualized in graphs, which takes a
long time to do manually, so we used Microsoft Excel to speed this up. Another problem was that we had no
prior knowledge of Scala when we started writing the code and had to attend courses to reach the code
quality we aspired to.


3.3. Results 

We carried out an extensive performance evaluation of the proposed method by feeding the emulator
data sets of different sizes and measuring the CPU time as the performance measure. Figure 6 shows the
efficiency of the security algorithms in terms of encryption time for different data sizes. The time to encrypt
files using the proposed hybrid algorithm is less than that of the other two algorithms used in this study.
This improvement arises because the proposed approach classifies the data according to importance and
uses the appropriate encryption algorithm for each class, which is reflected in the time and performance of
the proposed algorithm compared with the other algorithms. Regarding decryption time for different data
sizes, the time to decrypt files using the proposed hybrid algorithm is also less than that of the other two
algorithms and is closest to that of the SC algorithm. This is because the proposed approach decrypts each
class of data with the same algorithm that was used to encrypt it, which gives the proposed approach an
advantage over the other algorithms.
Figure 6. Encryption time of RC4, AES and hybrid


4. CONCLUSION

In this work, different encryption algorithms have been tested and their efficiency has been
compared in terms of encryption/decryption time. The block cipher showed the worst results among the
algorithms implemented in this article. There is also good evidence that the stream cipher generally takes
less time than the block cipher for encryption/decryption. We therefore developed a hybrid system that takes
advantage of both SC and BC to decrease encryption/decryption time and processing time and to increase
data security. We used the ML classifiers NB, SVM, and KNN to classify data into important and more
important parts; KNN gave the best result, 76.34%. The hybrid system then encrypts/decrypts important
data using SC and more important data using BC, with the size of the data cut in half, in order to get the
best results from the hybrid system with regard to encryption/decryption time and CPU time.

The use of block ciphers and stream ciphers is not limited to text. Image encryption plays an
important role in hiding information, so it is important to protect image data from unauthorized access.
One previously proposed method uses a stream cipher for selective encryption of 256-color and grayscale
images by encrypting discrete cosine transform (DCT) coefficients. Another method, based on the block
cipher, is the RDH-EI method. As future work, the proposed approach is expected to be tested and improved
for use with images, classifying and encrypting them, especially patient images on health platforms.